Population Count

Population Map

Most of Victoria's population is concentrated in the Melbourne city region. Other regions, though large in area, have smaller populations.

Population Table

Victorian Population
SA4_CODE_2016 femalepopulation malepopulation population
201 32726 34691 67417
202 32396 34054 66450
203 60660 64307 124967
204 35934 39614 75548
205 52929 57572 110501
206 159362 160819 320181
207 81814 86786 168600
208 96482 101671 198153
209 109370 122195 231565
210 71224 85167 156391
211 118179 129501 247680
212 151481 184164 335645
213 147830 178340 326170
214 62731 68190 130921
215 29867 33492 63359
216 25915 28796 54711
217 26236 29297 55533
297 0 9 9
299 765 1229 1994

Age Distribution


  • Most of the population is middle-aged, between 20 and 50 years.
  • Older people, a vulnerable group, make up a small share of the population.
  • The age distribution is similar for the male and female populations.

Population: Gender

Row

  • The largest group is health care professionals, where the ratio of men to women is less than one.

  • In construction, by contrast, more men are employed as labourers.

  • The number of women in the education sector far exceeds the number of men.

  • Management and Commerce is the most commonly studied field.

  • More men than women have studied Engineering and Technology. However, more people are employed in Health Care than in industries relating to Engineering.

  • More women have studied Management and Commerce; however, more men are employed as managers.

  • The most common qualification in Victoria is level 7, and most people are employed as professionals.

  • However, a large population is employed as labourers even though the share of people whose education is below high school level is very small.

  • GenderLinearModel shows the relationship between the male and female populations.

Column

Population by Education

  • The most common qualification is level 7 (bachelor degree), held by almost twice as many women as men.

  • Most male residents hold a level 3 or 4 qualification.

Population by Industry

Column

Population by Field

Population by Occupation

Population: Age

Column

  • As seen from the age distribution, all sectors have people in the 25 to 45 age group.
  • The 25 to 35 age group has the largest population in most sectors.
  • A key observation is that some people aged over 75 are still working.

Column

Population by Education

Population by Industries

Row

Population by Field

Population by Occupation

Population: Age

Row

Population by Education, Age

Education: Population
afq_level age_min population
Level 1 & 2 15 9402
Level 3 & 4 25 146297
Level 5 & 6 25 96920
Level 7 25 245613
Level 9 25 83204
Not Stated 25 70455
Level 8 35 28908

Population by Industries, Age

Industry: Population
industry age_min population
Accommodation_and_food_services 25 42103
Administrative_and_support_services 25 23086
Arts_and_recreation_services 25 13149
Construction 25 61959
Electricity_gas_water_and_waste_service 25 8039
Financial_and_insurance_services 25 32021
Health_care_and_social_assistance 25 80994
Information_media_and_telecommunications 25 14702
Not Stated 25 29901
Other_services 25 24089
Professional_scientific_and_technical_services 25 64125
Rental_hiring_and_real_estate_services 25 11796
Retail_trade 25 61803
Mining 35 2441
Wholesale_trade 35 22199
Education_and_training 45 56125
Manufacturing 45 55206
Public_administration_and_safety 45 37747
Transport_postal_and_warehousing 45 32663
Agriculture_forestry_and_fishing 55 12733

Row

Population by Field

Field: Population
field age_min population
Mixed_Field_Programmes 15 1813
Architecture_and_Building 25 42510
Creative_Arts 25 40334
Food_Hospitality_and_Personal_Services 25 42938
Health 25 67630
Information_Technology 25 37535
Management_and_Commerce 25 150571
Natural_and_Physical_Sciences 25 22171
Not Stated 25 71440
Society_and_Culture 25 80932
Agriculture_Environment 35 13016
Engineering_and_Technologies 45 77524
Education 55 44696
NA NA 896

Population by Occupation

Occupation: Population
occupation age_min population
Community_and_personal_service_workers 25 67104
Not Stated 25 11075
Professionals 25 190449
Sales_workers 25 51772
Technicians_and_trades_workers 25 99110
Managers 35 100601
Clerical_and_administrative_workers 45 89021
Labourers 45 49653
Machinery_operators_and_drivers 45 40922

Region : Sectors

Column

  • The bar plots show the working population of each SA4 region by education level, field of study, industry of employment and occupation.

  • Region 206 had the most people with the highest education levels, which is consistent with it also having the most people employed as professionals in their respective industries.

  • Management and commerce, and engineering and technology, were the most common fields of study, while agriculture and environment, and mixed field programmes, had the smallest population shares.

  • Health care, manufacturing and retail trade were the industries employing the most people, and Professionals and Managers were the most common occupations.

Column

Education Level: Region

Best education level of each region

Industry: Region

Column

Field: Region

Best field of each region

Occupation: Region

(G52 Analysis)

Row

Chart A

  • Both figures show that, overall, women outnumber men among workers. However, as the number of hours worked increases, men outnumber women.

Row

Chart B

  • The figure shows that health care, education and training, construction, and professional and technical services have larger working populations as working hours increase. Mining and electricity, gas and water show small working populations irrespective of hours worked.

(G58 Analysis)

Row

Chart C

  • The figure shows that, overall, women outnumber men in all occupations. At the maximum hours worked, however, the numbers of men and women are about the same.

Row

Chart D

  • The figure shows that most employees in the SA4 regions work as Professionals, Managers, and Technicians and trades workers. Professionals accounted for the highest number of employees in region 206, while machinery operators and drivers accounted for the lowest number in region 213.

Maps

Column

  • The maps show the SA4 regions and the distribution of their population by education level, industry, field of study and occupation.

  • Most of the population has completed education level 7, with management and commerce as the most common field of study.

  • The highest numbers of people are employed as Professionals, Managers, and Technicians and trades workers.

  • Health care is the major industry in the city regions, while the country regions are more active in agriculture.

Column

Education Level: Region

Spatial Education Level Distribution

Industry: Region

Spatial Industry Distribution

Column

Field: Region

Spatial Study Field Distribution

Occupation: Region

Spatial Occupation Distribution

Networks

Column

Education Level: Region

Industry: Region

Column

Field: Region

Occupation: Region

Conclusion

Column

Conclusion

The education level, field of study, industry of employment and occupation of the Victorian population were studied at the SA4 level, with distributions broken down by age and sex. The tables and plots were compared to identify covariation between the population distributions, for example the trend between field of study and industry of employment. Networks were drawn based on the population weights to analyze these trends. Some trends were found to be interesting, such as more men being employed as managers even though more women had studied management. Choropleth maps were made to analyze these trends spatially.

The goal of this report is to create a data story from these statistical summaries, to enumerate the facts in the data and link them to the real world. The data provided by the Australian Bureau of Statistics is aggregated open data and in no form identifies individuals who participated in the census. The ABS aims to integrate the census data with other datasets to make it more interesting. We aim to do the same and bring out some interesting data stories as we continue building this report.

References

Data Sources

---
title: "ETC5513 Assignment4 -Team StarWars"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    navbar:
      - { title: "About", href: "https://github.com/mohammedfaizan0014/etc5513-assignment-4-star-wars/blob/main/README.md", align: left }
    social: [ "twitter", "facebook", "menu" ]
    source_code: embed
---

```{r echo=FALSE, include=FALSE}
knitr::opts_chunk$set(fig.path = "Figures/", fig.align ="center",
                      out.width = "50%", echo = FALSE, 
                      message = FALSE, 
                      warning = FALSE)
# Loading Libraries
library(tidyverse)
library(readr)
library(kableExtra)
library(tinytex)
library(bookdown)
library(naniar)
library(visdat)
library(citation)
library(knitr)
library(scales)
library(patchwork)
library(sf)
library(glue)
library(unglue)
library(sugarbag)
library(readxl)
library(plotly)
library(tidytext)
library(ggplot2)
library(igraph)
library(ggraph)
```


```{r}
data_path <- here::here("data/australian_census_data_2016/")
census_paths <- glue::glue(data_path, "/2016 Census GCP All Geographies for VIC/SA4/VIC/2016Census_G{number}{alpha}_VIC_SA4.csv", 
                         number = c("46","46","47","47","47","51","51","51","51","57","57", "52", "52", "52", "52", "58", "58"), alpha = c("A","B","A","B","C","A","B","C","D","A","B", "A","B","C","D", "A","B"))
```
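
The path construction above depends on `number` and `alpha` lining up position by position. As a minimal sketch of the same pairing in base R (using `sprintf()` instead of `glue::glue()`), the two vectors can be kept side by side in one table. The `tables` data frame, the `build_census_path()` helper and the shortened list of tables are illustrative assumptions, not part of the original analysis.

```r
# Hypothetical root; the analysis itself resolves this with here::here().
data_path <- file.path("data", "australian_census_data_2016")

# Keeping each table number next to its letter makes the pairing explicit,
# instead of relying on two parallel vectors staying aligned.
tables <- data.frame(
  number = c("46", "46", "47", "47", "47"),
  alpha  = c("A", "B", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not from the original code): returns one path per row.
build_census_path <- function(root, number, alpha) {
  file.path(
    root,
    "2016 Census GCP All Geographies for VIC", "SA4", "VIC",
    sprintf("2016Census_G%s%s_VIC_SA4.csv", number, alpha)
  )
}

census_paths_sketch <- build_census_path(data_path, tables$number, tables$alpha)
print(basename(census_paths_sketch))
```

Selecting files by name from such a table (for example, all rows where `number == "47"`) may be less fragile than positional indexing such as `census_paths[3:4]`.
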
```{r geopath}
geopath <- glue::glue(data_path, "/2016_SA4_shape/SA4_2016_AUST.shp")
sa4_codes<- read_csv(census_paths[2]) %>% 
                mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>% 
                select(SA4_CODE_2016)
sa4_geomap <- read_sf(geopath) %>%
  right_join(sa4_codes, by=c("SA4_CODE16" = "SA4_CODE_2016"))
```
```{r g46read}
g46a<- read_csv(census_paths[1]) %>%
  select(-starts_with("P"), -contains("Tot"), -contains("nfd"), -contains("IDes")) %>%
            mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>% 
  pivot_longer(cols = -c(SA4_CODE_2016),
                  names_to = "category",
                  values_to = "count") %>%
  unglue_unnest(category, 
                    c("{sex=[MF]}_{educationlevel=GradDip_and_GradCert}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{educationlevel=PGrad_Deg}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{educationlevel=BachDeg}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{educationlevel=AdvDip_and_Dip}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{educationlevel=Cert_III_IV}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{educationlevel=Cert_I_II}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{educationlevel=Lev_Edu_NS}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{educationlevel=Lev_Edu_NS|GradDip_and_GradCert|PGrad_Deg|BachDeg|AdvDip_and_Dip|Cert_III_IV|Cert_I_II}_{age_min=\\d+}ov"
                      
                      ),
                remove = FALSE) %>% 
  select(-category)
  
```
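
The patterns passed to `unglue_unnest()` above split each census column name into sex, education level and age bounds. The following base R sketch shows the same idea with a single regular expression. The example column names are assumptions inferred from those patterns, not values read from the census files, and `parsed` is an illustrative name.

```r
# Hypothetical column names, shaped like the patterns used with unglue_unnest().
category <- c("M_BachDeg_25_34", "F_Cert_III_IV_45_54", "F_PGrad_Deg_85ov")

# One regex covers both the closed age range ("25_34") and the open-ended
# form ("85ov"). The lazy (.+?) lets the education level contain underscores.
pattern <- "^([MF])_(.+?)_(\\d+)(?:_(\\d+)|ov)$"
parts <- regmatches(category, regexec(pattern, category, perl = TRUE))

# Turn each match (full match followed by four capture groups) into one row.
parsed <- do.call(rbind, lapply(parts, function(p) {
  has_max <- !is.na(p[5]) && nzchar(p[5])
  data.frame(
    sex            = p[2],
    educationlevel = p[3],
    age_min        = as.integer(p[4]),
    age_max        = if (has_max) as.integer(p[5]) else NA_integer_,
    stringsAsFactors = FALSE
  )
}))

print(parsed)
```

Open-ended age groups get a missing `age_max` here, which mirrors how the `ov` pattern above captures only `age_min`.
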


```{r}
g46a <- g46a %>% 
  mutate(afq_level =case_when(str_detect(educationlevel, "GradDip_and_GradCert") ~ "Level 8",
                            str_detect(educationlevel, "PGrad") ~ "Level 9",
                            str_detect(educationlevel, "BachDeg") ~ "Level 7",
                            str_detect(educationlevel, "AdvDip_and_Dip") ~ "Level 5 & 6",
                            str_detect(educationlevel, "Cert_III_IV") ~ "Level 3 & 4",
                            str_detect(educationlevel, "Cert_I_II") ~ "Level 1 & 2",
                            str_detect(educationlevel, "Cert_Levl_nfd") ~ "Level 3 & 4",
                            str_detect(educationlevel, "Lev_Edu_IDes") ~ "Level Inadequately Described",
                            str_detect(educationlevel, "Lev_Edu_NS") ~ "Not Stated",
                            TRUE ~ educationlevel)) %>% 
  rename(count_edu_lvl = count)
```
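
The `case_when()` mapping above matches substrings, so the result can depend on the order of the branches. A possible alternative, sketched here in base R, is an exact-match lookup with a named vector. The vector `aqf_lookup` and the sample input are illustrative assumptions; the level labels are the ones used in the chunk above.

```r
# Named vector: census abbreviation -> level label (labels copied from above).
aqf_lookup <- c(
  PGrad_Deg            = "Level 9",
  GradDip_and_GradCert = "Level 8",
  BachDeg              = "Level 7",
  AdvDip_and_Dip       = "Level 5 & 6",
  Cert_III_IV          = "Level 3 & 4",
  Cert_I_II            = "Level 1 & 2",
  Lev_Edu_NS           = "Not Stated"
)

# Hypothetical input, including one code that is not in the lookup.
educationlevel <- c("BachDeg", "Cert_I_II", "Lev_Edu_NS", "SomethingElse")

# Exact-name lookup; unmatched codes come back as NA.
afq_level <- unname(aqf_lookup[educationlevel])

# Fall back to the original code, like the TRUE ~ educationlevel branch.
missing_level <- is.na(afq_level)
afq_level[missing_level] <- educationlevel[missing_level]

print(afq_level)
```

Because names are matched exactly, a short code can never be captured by a branch meant for a longer one.
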
```{r}
g47 <- map_dfr(census_paths[3:4], ~{
  df <- read_csv(.x) %>%
      select(-starts_with("P"), -contains("Tot"), -contains("InadDes")) %>%
            mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>% 
  pivot_longer(cols = -c(SA4_CODE_2016),
                  names_to = "category",
                  values_to = "count") %>%
  unglue_unnest(category, 
                    c("{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}ov",
                      "{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}_years_and_over"
                      
                      ),
                remove = FALSE)
})
```


```{r}
g47 <- g47 %>% 
  mutate(field =case_when(
    str_detect(field, "NatPhyl_Scn") ~ "Natural_and_Physical_Sciences",
                            str_detect(field, "InfoTech") ~ "Information_Technology",
                            str_detect(field, "Eng_RelTec") ~ "Engineering_and_Technologies",
                            str_detect(field, "ArchtBldng") ~ "Architecture_and_Building",
                            str_detect(field, "Ag_Envir_Rltd_Sts") ~ "Agriculture_Environment",
                            str_detect(field, "Health") ~ "Health",
                            str_detect(field, "Educ") ~ "Education",
                            str_detect(field, "Mgnt_Com") ~ "Management_and_Commerce",
                            str_detect(field, "Society_Cult") ~ "Society_and_Culture",
                            str_detect(field, "Creative_Arts") ~ "Creative_Arts",
                            str_detect(field, "Fd_Hosp_Psnl_Svcs") ~ "Food_Hospitality_and_Personal_Services",str_detect(field, "MixFld_Prgm") ~ "Mixed_Field_Programmes",
                            str_detect(field, "FldStd_NS") ~ "Not Stated",
                            TRUE ~ field))  %>%
  select(-category) %>%
  rename(count_field = count)
```
```{r}
g51 <- map_dfr(census_paths[6:8], ~{
  df <- read_csv(.x) %>%
      select(-starts_with("P"), -contains("Tot")) %>%
            mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>% 
  pivot_longer(cols = -c(SA4_CODE_2016),
                  names_to = "category",
                  values_to = "count") %>%
  unglue_unnest(category, 
                    c("{sex=[MF]}_{industry=(Ag_For_Fshg|Mining|Manufact|El_Gas_Wt_Waste|Constru|WhlesaleTde|RetTde|Accom_food|Trans_post_wrehsg|Info_media_teleco|Fin_Insur|RtnHir_REst|Pro_scien_tec|Admin_supp|Public_admin_sfty|Educ_trng|HlthCare_SocAs|Art_recn|Oth_scs|ID_NS)}_{age_min=\\d+}_{age_max=\\d+}",
                      "{sex=[MF]}_{industry=(Ag_For_Fshg|Mining|Manufact|El_Gas_Wt_Waste|Constru|WhlesaleTde|RetTde|Accom_food|Trans_post_wrehsg|Info_media_teleco|Fin_Insur|RtnHir_REst|Pro_scien_tec|Admin_supp|Public_admin_sfty|Educ_trng|HlthCare_SocAs|Art_recn|Oth_scs|ID_NS)}_{age_min=\\d+}ov"
                     ),
                remove = FALSE)
})
```
```{r}
g51 <- g51 %>% 
  mutate(industry =case_when(
                            str_detect(industry, "Ag_For_Fshg") ~ "Agriculture_forestry_and_fishing",
                            str_detect(industry, "Manufact") ~ "Manufacturing",
                            str_detect(industry, "El_Gas_Wt_Waste") ~ "Electricity_gas_water_and_waste_service",
                            str_detect(industry, "Constru") ~ "Construction",
                            str_detect(industry, "Ag_Envir_Rltd_Sts") ~ "Agriculture_Environment",
                            str_detect(industry, "WhlesaleTde") ~ "Wholesale_trade",
                            str_detect(industry, "RetTde") ~ "Retail_trade",
                            str_detect(industry, "Accom_food") ~ "Accommodation_and_food_services",
                            str_detect(industry, "Trans_post_wrehsg") ~ "Transport_postal_and_warehousing",
                            str_detect(industry, "Info_media_teleco") ~ "Information_media_and_telecommunications",
                            str_detect(industry, "Fin_Insur") ~ "Financial_and_insurance_services",
                            str_detect(industry, "RtnHir_REst") ~ "Rental_hiring_and_real_estate_services",
                            str_detect(industry, "Pro_scien_tec") ~ "Professional_scientific_and_technical_services",
                            str_detect(industry, "Admin_supp") ~ "Administrative_and_support_services",
                            str_detect(industry, "Public_admin_sfty") ~ "Public_administration_and_safety",
                            str_detect(industry, "Educ_trng") ~ "Education_and_training",
                            str_detect(industry, "HlthCare_SocAs") ~ "Health_care_and_social_assistance",
                            str_detect(industry, "Art_recn") ~ "Arts_and_recreation_services",
                            str_detect(industry, "Oth_scs") ~ "Other_services",
                            str_detect(industry, "ID_NS") ~ "Not Stated",
                            TRUE ~ industry))  %>%
  select(-category) %>%
  rename(count_industry = count)
```
```{r}
g57 <- map_dfr(census_paths[10], ~{
  df <- read_csv(.x) %>%
      select(-starts_with("P"), -contains("Tot")) %>%
            mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>% 
  pivot_longer(cols = -c(SA4_CODE_2016),
                  names_to = "category",
                  values_to = "count") %>%
  unglue_unnest(category, 
                    c("{sex=[MF]}{age_min=\\d+}_{age_max=\\d+}_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}",
                      "{sex=[MF]}{age_min=\\d+}ov_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}",
                      "{sex=[MF]}{age_min=\\d+}_ov_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}"
                      ),  
                remove = FALSE)  
})
```
```{r}
g57 <- g57 %>% 
  mutate(occupation =case_when(
                            str_detect(occupation, "TechnicTrades_W") ~ "Technicians_and_trades_workers",
                            str_detect(occupation, "TechnicTrades_Wrs") ~ "Technicians_and_trades_workers",
                            str_detect(occupation, "CommunPersnlSvc") ~ "Community_and_personal_service_workers",
                            str_detect(occupation, "ClericalAdminis_W") ~ "Clerical_and_administrative_workers",
                            str_detect(occupation, "Sales_W") ~ "Sales_workers",
                            str_detect(occupation, "Mach_oper_drivers") ~ "Machinery_operators_and_drivers",
                            str_detect(occupation, "Occu_ID_NS") ~ "Not Stated",
                            TRUE ~ occupation))  %>%
  select(-category) %>%
  rename(count_occupation = count)
```

```{r}
g52 <- map_dfr(census_paths[12:14], ~{
  df <- read_csv(.x) %>%
      select(-starts_with("P"), -contains("Tot")) %>%
            mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>% 
  pivot_longer(cols = -c(SA4_CODE_2016),
                  names_to = "category",
                  values_to = "count") %>%
  unglue_unnest(category, 
                    c("{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}_{hr_max=\\d+}",
                      "{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}",
                      "{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}over"
                     ),
                remove = FALSE)
})
```
```{r}
g52 <- g52 %>% 
  mutate(industry =case_when(
                            str_detect(industry, "AgriForestFish") ~ "Agriculture_forestry_and_fishing",
                            str_detect(industry, "Min") ~ "Mining",
                            str_detect(industry, "Mnfg") ~ "Manufacturing",
                            str_detect(industry, "EGW_WS") ~ "Electricity_gas_water_and_waste_service",
                            str_detect(industry, "Cnstn") ~ "Construction",
                            str_detect(industry, "WTrade") ~ "Wholesale_trade",
                            str_detect(industry, "RTrade") ~ "Retail_trade",
                            str_detect(industry, "AccomFoodS") ~ "Accommodation_and_food_services",
                            str_detect(industry, "TransPostWhse") ~ "Transport_postal_and_warehousing",
                            str_detect(industry, "InfoMedTelecom") ~ "Information_media_and_telecommunications",
                            str_detect(industry, "FinInsurS") ~ "Financial_and_insurance_services",
                            str_detect(industry, "RentHirREserv") ~ "Rental_hiring_and_real_estate_services",
                            str_detect(industry, "ProScieTechServ") ~ "Professional_scientific_and_technical_services",
                            str_detect(industry, "AdminSupServ") ~ "Administrative_and_support_services",
                            str_detect(industry, "PubAdmiSafety") ~ "Public_administration_and_safety",
                            str_detect(industry, "EducTrain") ~ "Education_and_training",
                            str_detect(industry, "HealthCareSocA") ~ "Health_care_and_social_assistance",
                            str_detect(industry, "ArtRecServ") ~ "Arts_and_recreation_services",
                            str_detect(industry, "OthServ") ~ "Other_services",
                            str_detect(industry, "ID_NS") ~ "Not Stated",
                            TRUE ~ industry))  %>%
  select(-category) %>%
  rename(count_industry = count)
```
```{r}
g58 <- map_dfr(census_paths[16], ~{
  df <- read_csv(.x) %>%
      select(-starts_with("P"), -contains("Tot")) %>%
            mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>% 
  pivot_longer(cols = -c(SA4_CODE_2016),
                  names_to = "category",
                  values_to = "count") %>%
  unglue_unnest(category, 
                    c("{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}_{hrs_max=\\d+}",
                      "{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}",
                      "{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}over"
                      ),  
                remove = FALSE)  
})
```
```{r}
g58 <- g58 %>% 
  mutate(occupation =case_when(
                            str_detect(occupation, "Mng") ~ "Managers",
                            str_detect(occupation, "Pro") ~ "Professionals",
                            str_detect(occupation, "TTW") ~ "Technicians_and_trades_workers",
                            str_detect(occupation, "TechnicTrades_Wrs") ~ "Technicians_and_trades_workers",
                            str_detect(occupation, "CPS") ~ "Community_and_personal_service_workers",
                            str_detect(occupation, "CA") ~ "Clerical_and_administrative_workers",
                            str_detect(occupation, "Sal") ~ "Sales_workers",
                            str_detect(occupation, "MOD") ~ "Machinery_operators_and_drivers",
                            str_detect(occupation, "Lab") ~ "Labourers",
                            str_detect(occupation, "ID_NS") ~ "Not Stated",
                            TRUE ~ occupation))  %>%
  select(-category) %>%
  rename(count_occupation = count)
```

Population Count {.storyboard}
=========================================

### Population Map

```{r}
vicpopulation <- g51 %>%
  group_by(SA4_CODE_2016) %>%
  summarise(population = sum(count_industry)) %>%
  ungroup()
population <- vicpopulation %>%
  summarise(population=sum(population))
vicpopulation <- g51 %>%
  group_by(SA4_CODE_2016, sex) %>%
  summarise(population = sum(count_industry)) %>%
  ungroup(sex) %>%
  pivot_wider(names_from = sex,
              values_from = population) %>%
  rename(malepopulation = M,
         femalepopulation = `F`) %>%
  full_join(vicpopulation)
```
```{r}
vicpopulation %>% 
  full_join(sa4_geomap, 
            by = c("SA4_CODE_2016"="SA4_CODE16")) %>%
  ggplot() +
  geom_sf(mapping = aes(geometry= geometry, fill=population)) +
  # A constant colour belongs outside aes(); inside, it is mapped as data.
  geom_sf_text(aes(geometry= geometry,label=SA4_CODE_2016), colour="white", 
               check_overlap=TRUE)+
  theme_void() 
```

> Most of Victoria's population is concentrated in the Melbourne city region.
> Other regions, though large in area, have smaller populations.

### Population Table
```{r}
vicpopulation %>%
  kable(caption = "Victorian Population") %>% 
  kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")

```


### Age Distribution

```{r agedistributiong57, fig.height=4}
g57redundantage <-  g57[rep(rownames(g57), g57$count_occupation), ]

g57redundantage %>%
  ggplot()+
  # alpha is a fixed setting, so it goes outside aes()
  geom_density(mapping = aes( x = as.numeric(age_min),
                              colour = sex), 
               alpha = 0.5) +
  labs(x="age") +
  scale_x_continuous()
```

***
- Most of the population is middle-aged, between 20 and 50 years.
- Older people, a vulnerable group, make up a small share of the population.
- The age distribution is similar for the male and female populations.
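
The age-distribution chunk above repeats each row of `g57` once per counted person before estimating the density. A lighter alternative, sketched here in base R under made-up counts, is to pass the counts as weights. The vectors `age_min` and `count`, and the fixed bandwidth of 5 years, are illustrative assumptions rather than values from the census data.

```r
# Hypothetical lower bounds of the age groups and made-up counts per group.
age_min <- c(15, 20, 25, 35, 45, 55, 65, 75, 85)
count   <- c(200, 900, 2500, 2300, 2000, 1300, 300, 40, 5)

# density() expects weights that sum to one.
w <- count / sum(count)

# A fixed bandwidth is set because the default selector ignores the weights.
age_density <- density(age_min, weights = w, bw = 5)

# The estimated curve should integrate to roughly one.
step <- age_density$x[2] - age_density$x[1]
print(sum(age_density$y) * step)
```

In ggplot2 the same idea would be the `weight` aesthetic of `geom_density()`, which would avoid building the replicated data frame.
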


GenderLinearModel {.hidden}
=====================================

Column 
-----------------------------------------

### Occupation: Male vs Female

```{r}
g57 %>%
  pivot_wider(names_from = sex,
              values_from = count_occupation) %>%
  ggplot(mapping = aes(x = M, y = `F`, colour = occupation)) +
  geom_point() +
  labs(title = "Population: Male vs Female", 
       x = "Male Population",
       y= "Female Population") +
  scale_y_continuous(label=label_number()) +
  scale_x_continuous(label=label_number()) +
  theme(legend.position = "bottom")
```

### Occupation: Male vs Female

```{r}
g57occmfnest <- g57 %>%
  pivot_wider(names_from = sex,
              values_from = count_occupation) %>%
  select(occupation, `F`, M ) %>%
  group_by(occupation) %>%
  nest() %>%
  mutate(model = map(data, lm)) %>%
  mutate(aug = map(model, broomstick::augment)) %>%
  unnest(aug)
mvfocc <- ggplot(g57occmfnest,
       aes(x = M)) +
# index represent splitting value,
#geom_point(aes(y = `F`, colour = industry, aplha=0)) +
geom_line(aes(y = .fitted, colour = occupation)) +
  geom_abline(slope = 1) +
  theme(legend.position = "bottom")  +
  labs(title = "Population: Male vs Female", 
       x = "Male Population",
       y= "Female Population")
  # geom_text(aes(y = .fitted,label=industry, colour="white"), 
  #              check_overlap=TRUE)
ggplotly(mvfocc) %>%
  hide_legend()
```
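
The chunk above passes each nested data frame straight to `lm`, so the model appears to rely on column order (`F` first, then `M`) to decide which variable is the response. The following base R sketch fits the same kind of per-occupation line with an explicit formula. The `counts` data frame holds made-up numbers and is not taken from the census tables.

```r
# Made-up male and female counts for two occupations.
counts <- data.frame(
  occupation = rep(c("Managers", "Professionals"), each = 4),
  M = c(10, 20, 30, 40, 10, 20, 30, 40),
  F = c(5, 10, 15, 20, 12, 24, 36, 48),
  stringsAsFactors = FALSE
)

# One linear model per occupation, with the response named explicitly.
fits <- lapply(
  split(counts, counts$occupation),
  function(d) lm(`F` ~ M, data = d)
)

# Slope of female count on male count for each occupation.
slopes <- vapply(fits, function(m) unname(coef(m)["M"]), numeric(1))
print(slopes)
```

With these toy numbers the slope is below one for Managers and above one for Professionals, which is the kind of contrast the fitted lines are compared against the `geom_abline(slope = 1)` reference for.
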

Column 
-----------------------------------------

### Industry: Male vs Female

```{r}
g51 %>%
  pivot_wider(names_from = sex,
              values_from = count_industry) %>%
  ggplot(mapping = aes(x = M, y = `F`, colour = industry)) +
  geom_point() +
  labs(title = "Population: Male vs Female", 
       x = "Male Population",
       y= "Female Population") +
  scale_y_continuous(label=label_number()) +
  scale_x_continuous(label=label_number()) +
  theme(legend.position = "bottom")
```

### Industry: Male vs Female

```{r}
g51indmfnest <- g51 %>%
  pivot_wider(names_from = sex,
              values_from = count_industry) %>%
  select(industry, `F`, M ) %>%
  group_by(industry) %>%
  nest() %>%
  mutate(model = map(data, lm)) %>%
  mutate(aug = map(model, broomstick::augment)) %>%
  unnest(aug)
mvf <- ggplot(g51indmfnest,
       aes(x = M)) +
# index represent splitting value,
#geom_point(aes(y = `F`, colour = industry, aplha=0)) +
geom_line(aes(y = .fitted, colour = industry)) +
    geom_abline(slope = 1) +
  theme(legend.position = "bottom")  +
  labs(title = "Population: Male vs Female", 
       x = "Male Population",
       y= "Female Population")

  # geom_text(aes(y = .fitted,label=industry, colour="white"), 
  #              check_overlap=TRUE)
ggplotly(mvf) %>%
  hide_legend()
```

Population: Gender {data-navmenu="Analysis"} 
=========================================

Row 
-----------------------------------------

- The largest group is health care professionals, where the ratio of men to women is less than one.
- In construction, by contrast, more men are employed as labourers.
- The number of women in the education sector far exceeds the number of men.
- Management and Commerce is the most commonly studied field.
- More men than women have studied Engineering and Technology. However, more people are employed in Health Care than in industries relating to Engineering.
- More women have studied Management and Commerce; however, more men are employed as managers.
- The most common qualification in Victoria is level 7, and most people are employed as professionals.
- However, a large population is employed as labourers even though the share of people whose education is below high school level is very small.

- [GenderLinearModel] shows the relationship between the male and female populations.

Column {.tabset}
-----------------------------------------
### Population by Education

- The most common qualification is level 7 (bachelor degree), held by almost twice as many women as men.

- Most male residents hold a level 3 or 4 qualification.


```{r}
g46a %>%
  ggplot(mapping = aes(x = fct_reorder(afq_level,count_edu_lvl), y = count_edu_lvl, fill = sex)) +
  geom_col(mapping = aes(x = reorder_within(afq_level,count_edu_lvl, sex), y = count_edu_lvl, fill = sex)) +
  labs(title = "Population Share of education level", y = "Number of students") +
  scale_y_continuous(label=label_number()) +
  theme(axis.title.y = element_blank())+
  coord_flip()
```

### Population by Industry



```{r}
g51 %>%
  ggplot(mapping = aes(x = fct_reorder(industry,count_industry), y = count_industry, fill = sex)) +
  geom_col(mapping = aes(x = reorder_within(industry,count_industry, sex), y = count_industry, fill = sex)) +
  labs(title = "Population Share of Industries", y = "Number of Employees") +
  scale_y_continuous(label=label_number()) +
  theme(axis.title.y = element_blank())+
  coord_flip() 
```

Column {.tabset}
-----------------------------------------

### Population by Field



```{r}
ggplot(g47, aes(x = reorder_within(field,count_field,sex),
                y = count_field,
                fill = sex)) +
  geom_col() +
  # drop the "___sex" suffix that reorder_within() adds to the axis labels
  scale_x_reordered() +
  labs(x = "Field",
       y = "number of observations",
       title = "Field by gender") +
  scale_y_continuous(labels = label_number()) +
  theme(axis.title.y = element_blank()) +
  coord_flip()

```



### Population by Occupation


```{r}
g57 %>%
  ggplot() +
  geom_col(mapping = aes(x = reorder_within(occupation,count_occupation, sex), y = count_occupation, fill = sex)) +
  # drop the "___sex" suffix that reorder_within() adds to the axis labels
  scale_x_reordered() +
  labs(title = "Population Share of Occupation", y = "Number of Employees") +
  scale_y_continuous(labels = label_number()) +
  theme(axis.title.y = element_blank()) +
  coord_flip()
```

Population: Age {data-orientation=columns data-navmenu="Analysis"} 
=========================================

Column {data-width=30%}
-----------------------------------------

- As the age distribution shows, every sector includes people aged 25 to 45.
- The 25-35 age group has the largest population in every sector.
- A key observation is that some people aged over 75 are still working.
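
Finding the largest age group per sector is a grouped maximum. The sketch below illustrates it on made-up numbers and is not the dashboard's own code: `toy_age`, its `count` column and the values are assumptions; `industry` and `age_min` follow the column names used in this report. It also shows that `slice_max()` keeps tied rows by default, which can return more than one age group for a sector.

```r
library(dplyr)

# Hypothetical counts per industry and age band (age_min is the lower bound).
toy_age <- data.frame(
  industry = c("Retail", "Retail", "Retail", "Mining", "Mining", "Mining"),
  age_min  = c(15, 25, 35, 15, 25, 35),
  count    = c(50, 80, 60, 10, 40, 40)
)

by_age <- toy_age %>%
  group_by(industry, age_min) %>%
  summarise(population = sum(count), .groups = "drop")

# Default behaviour: tied maxima are all kept (Mining has two rows).
peak_with_ties <- by_age %>%
  group_by(industry) %>%
  slice_max(population, n = 1) %>%
  ungroup()

# with_ties = FALSE returns exactly one row per industry.
peak_single <- by_age %>%
  group_by(industry) %>%
  slice_max(population, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(peak_with_ties)
nrow(peak_single)
```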



Column {.tabset}
-----------------------------------------
### Population by Education


```{r}
popeduage <- g46a %>%
  group_by(educationlevel, age_min) %>%
  summarise(count_eduage = sum(count_edu_lvl)) %>%
  ungroup()
nodes <- data.frame(node = unique(popeduage$educationlevel),
                    category = "education level") %>%
  full_join(data.frame(node = unique(popeduage$age_min),
                    category = "age"))
popeduage <- popeduage[,c(1,2,3,1)]
networkeduage <-   graph_from_data_frame(d=popeduage,directed = TRUE, vertices = nodes)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkeduage %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_eduage,edge_width = count_eduage,edge_color = educationlevel),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```


### Population by Industries



```{r}
popindage <- g51 %>%
  group_by(industry, age_min) %>%
  summarise(count_indage = sum(count_industry)) %>%
  ungroup()
nodes <- data.frame(node = unique(popindage$industry),
                    category = "industry") %>%
  full_join(data.frame(node = unique(popindage$age_min),
                    category = "age"))
popindage <- popindage[,c(1,2,3,1)]
networkindage <-   graph_from_data_frame(d=popindage,directed = TRUE, vertices = nodes)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkindage %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_indage,edge_width = count_indage,edge_color = industry),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```

Row {.tabset}
-----------------------------------------

### Population by Field

```{r}
popfieldage <- g47 %>%
  group_by(field, age_min) %>%
  summarise(count_fieldage = sum(count_field)) %>%
  ungroup() %>%
  filter(!is.na(field))
nodes <- data.frame(node = unique(popfieldage$field),
                    category = "field") %>%
  full_join(data.frame(node = unique(popfieldage $age_min),
                    category = "age")) 
popfieldage <- popfieldage [,c(1,2,3,1)]
networkfieldage <-   graph_from_data_frame(d= popfieldage,directed = TRUE, vertices = nodes)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkfieldage %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_fieldage, edge_width = count_fieldage,edge_color = field),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```




### Population by Occupation

```{r}
popoccage <- g57 %>%
  group_by(occupation, age_min) %>%
  summarise(count_occage = sum(count_occupation)) %>%
  ungroup()
nodesocc <- data.frame(node = unique(popoccage$occupation),
                    category = "occupation") %>%
  # use the age groups of popoccage itself rather than those of the industry table
  full_join(data.frame(node = unique(popoccage$age_min),
                    category = "age"))
popoccage <- popoccage[,c(1,2,3,1)]
networkoccage <-   graph_from_data_frame(d=popoccage,directed = TRUE, vertices = nodesocc)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkoccage %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_occage,edge_width = count_occage,edge_color = occupation),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```


Population: Age {data-orientation=rows data-navmenu="Analysis"} 
=========================================

Row {.tabset}
-----------------------------------------
### Population by Education, Age


```{r}
g46a %>%
  group_by(afq_level, age_min) %>%
  summarise(population=sum(count_edu_lvl)) %>%
  group_by(afq_level) %>%
  slice_max(population, n=1) %>%
  arrange(age_min)%>%
  kable(caption = "Education: Population") %>% 
  kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```


### Population by Industries, Age



```{r}
g51 %>%
  group_by(industry, age_min) %>%
  summarise(population=sum(count_industry)) %>%
  group_by(industry) %>%
  slice_max(population, n=1) %>%
  arrange(age_min)%>%
  kable(caption = "Industry: Population") %>% 
  kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```

Row {.tabset}
-----------------------------------------

### Population by Field



```{r}
g47 %>%
  group_by(field, age_min) %>%
  summarise(population=sum(count_field)) %>%
  group_by(field) %>%
  slice_max(population, n=1) %>%
  arrange(age_min)%>%
  kable(caption = "Field: Population") %>% 
  kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```




### Population by Occupation


```{r}
g57%>%
  group_by(occupation, age_min) %>%
  summarise(population=sum(count_occupation)) %>%
  group_by(occupation) %>%
  slice_max(population, n=1) %>%
  arrange(age_min)%>%
  kable(caption = "Occupation: Population") %>% 
  kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```



Region : Sectors {data-orientation=rows data-navmenu="Regions"}
=========================================

Column 
-----------------------------------------

- The bar plots show, for each SA4 region, the working population by education level, field of study, industry of employment and occupation.

- Region 206 had the largest number of people with the highest education levels, which is consistent with region 206 also having the largest number of people employed as professionals.

- Management and commerce, and engineering and technology, were the most commonly studied fields, while agriculture, environment and mixed field programs had the smallest population share.

- Health care, manufacturing and retail trade were the industries with the largest populations, and Professionals and Managers were the most common occupations.
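
The bar plots on this page pick, for each category, the region with the largest count. Grouping by region instead answers a different question: which category is largest within each region. The sketch below contrasts the two on made-up numbers; `toy_region`, its `count` column and the values are assumptions for illustration, while `SA4_CODE_2016` and `occupation` follow the column names used in this report.

```r
library(dplyr)

# Hypothetical counts for two regions and two occupations.
toy_region <- data.frame(
  SA4_CODE_2016 = c("206", "206", "213", "213"),
  occupation    = c("Professionals", "Labourers", "Professionals", "Labourers"),
  count         = c(900, 100, 400, 300)
)

# Region with the largest count for each occupation (one row per occupation).
best_region <- toy_region %>%
  group_by(occupation) %>%
  slice_max(count, n = 1) %>%
  ungroup()

# Occupation with the largest count in each region (one row per region).
top_occupation <- toy_region %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count, n = 1) %>%
  ungroup()

best_region
top_occupation
```

In this toy data, Labourers peak in region 213 even though Professionals are the largest occupation in both regions, so the two summaries should not be read interchangeably.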

Column 
-----------------------------------------
### Education Level: Region


```{r bestedu, fig.cap="Best education level of each region"}
popareaedu <- g46a %>%
  group_by(SA4_CODE_2016, afq_level) %>%
  summarise(count_eduarea = sum(count_edu_lvl)) %>%
  ungroup()

bestedu <- popareaedu %>% 
  select(1:3) %>%
  group_by(afq_level) %>%
  slice_max(count_eduarea) %>%
  arrange(SA4_CODE_2016)

bestedu %>%
  ggplot() +
  geom_col(mapping = aes(x = reorder_within(afq_level,count_eduarea, SA4_CODE_2016), y = count_eduarea, fill = afq_level)) +
  labs(title = "Region and Best education level", 
       x = "Education level with region code",
       y = "Number of Students") +
  scale_y_continuous(label=label_number()) +
  coord_flip()  +
  theme(legend.position = "none")
```


### Industry: Region

```{r popareaind, fig.cap=""}
popindarea <- g51 %>%
  group_by(SA4_CODE_2016, industry) %>%
  summarise(count_indarea = sum(count_industry)) %>%
  ungroup()
popareaindmax <- popindarea %>% 
  select(1:3) %>%
  group_by(industry) %>%
  slice_max(count_indarea) %>%
  arrange(SA4_CODE_2016)

popareaindmax %>%
  ggplot() +
  # popareaindmax has no count_industry or sex column, so all aesthetics are set on the layer
  geom_col(mapping = aes(x = reorder_within(industry,count_indarea, SA4_CODE_2016), y = count_indarea, fill = industry)) +
  labs(title = "Region and Best Industry",
       x = "Industry with region code",
       y = "Number of Employees") +
  scale_y_continuous(labels = label_number()) +
  coord_flip()  +
  theme(legend.position = "none")
```


Column 
-----------------------------------------

### Field: Region



```{r bestfield, fig.cap="Best field of each region"}
popareafield <- g47 %>%
  group_by(SA4_CODE_2016, field) %>%
  summarise(count_fieldarea = sum(count_field)) %>%
  ungroup()

bestfield <- popareafield %>% 
  select(1:3) %>%
  group_by(field) %>%
  slice_max(count_fieldarea) %>%
  arrange(SA4_CODE_2016)

bestfield %>%
  ggplot() +
  geom_col(mapping = aes(x = reorder_within(field,count_fieldarea, SA4_CODE_2016), y = count_fieldarea, fill = field)) +
  labs(title = "Region and Best Field", 
       x = "Fields with region code",
       y = "Number of Students") +
  scale_y_continuous(label=label_number()) +
  coord_flip()  +
  theme(legend.position = "none")
```




### Occupation: Region


```{r popareaocc, fig.cap=""}
popoccarea <- g57 %>%
  group_by(SA4_CODE_2016, occupation) %>%
  summarise(count_occarea = sum(count_occupation)) %>%
  ungroup()
popareaoccmax <- popoccarea %>% 
  select(1:3) %>%
  group_by(occupation) %>%
  slice_max(count_occarea) %>%
  arrange(SA4_CODE_2016)

popareaoccmax %>%
  ggplot() +
  # popareaoccmax has no sex column, so all aesthetics are set on the layer
  geom_col(mapping = aes(x = reorder_within(occupation,count_occarea, SA4_CODE_2016), y = count_occarea, fill = occupation)) +
  labs(title = "Region and Best Occupation",
       x = "Occupation with region code",
       y = "Number of Employees") +
  scale_y_continuous(labels = label_number()) +
  coord_flip()  +
  theme(legend.position = "none")
```


(G52 Analysis) {data-navmenu="G52"}
=============================

Row{data-width=420}
-----------------------------------------------------------------------
### Chart A

- Both figures show that, overall, women worked more than men. However, as the number of working hours increased, men worked more than women.

```{r, include=FALSE}
p1 <- g52 %>% 
  mutate(hr_min = as.numeric(hr_min)) %>% 
  summarise(hr_min = sum(hr_min, na.rm = TRUE))
p2 <- g52 %>% 
  mutate(hr_max = as.numeric(hr_max)) %>% 
  summarise(hr_max = sum(hr_max, na.rm = TRUE))
```

```{r hr_plots, fig.show='hold', out.width="50%"}
# g52 is already supplied by the pipe, so it is not passed to ggplot() a second time
p1 <- g52 %>%
  ggplot(mapping = aes(x = hr_min,
                       y = count_industry,
                       fill = sex)) +
  geom_bar(stat = "identity",
           position = "dodge") +
  theme_bw() +
  xlab("Minimum Hours") +
  ylab("Count") +
  ggtitle("Min hours worked for Industries")
p1

p2 <- g52 %>%
  ggplot(mapping = aes(x = hr_max,
                       y = count_industry,
                       fill = sex)) +
  geom_bar(stat = "identity",
           position = "dodge") +
  theme_bw() +
  xlab("Maximum Hours") +
  ylab("Count") +
  ggtitle("Max hours worked for Industries")
p2
```

Row{data-height=200}
-----------------------------------------------------------------------
### Chart B

- The figure shows that industries such as health care, education and training, construction, and professional and technical services have a larger working population as working hours increase. Mining, electricity, gas and water show a low working population irrespective of working hours.
```{r ind_hrs}
g52redundanthrs <-  g52[rep(rownames(g52), g52$count_industry), ]


hrindcount <- g52redundanthrs %>%
  ggplot(mapping = aes(x = hr_min, y = industry)) +
  geom_count() +
  labs(title = "Population: Industries and hours", x = "Hours") +
  theme(axis.title.y = element_blank())

ggplotly(hrindcount)
```
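
The chunk above repeats each row of `g52` as many times as its `count_industry` value, so that `geom_count()` sees one row per person and can size the points by headcount. The following is a minimal sketch of that expansion on made-up data; `toy_hours` and its values are assumptions for illustration, and it indexes rows by position with `seq_len()` rather than by row name.

```r
# Hypothetical aggregated table: one row per industry and hours band.
toy_hours <- data.frame(
  industry       = c("Mining", "Retail"),
  hr_min         = c(35, 16),
  count_industry = c(2, 3)
)

# Repeat row i count_industry[i] times, giving one row per counted person.
expanded <- toy_hours[rep(seq_len(nrow(toy_hours)), toy_hours$count_industry), ]

nrow(expanded)
table(expanded$industry)
```

`tidyr::uncount()` performs the same expansion if tidyr is already loaded.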


(G58 Analysis) {data-navmenu="G52"}
=============================
Row{data-width=400}
-----------------------------------------------------------------------
### Chart C

- The figure shows that, overall, women worked more than men in all occupations. For maximum hours worked, however, the numbers of men and women remained the same as the number of working hours increased.

```{r, include=FALSE}
p3 <- g58 %>% 
  mutate(hrs_min = as.numeric(hrs_min)) %>% 
  summarise(hrs_min = sum(hrs_min, na.rm = TRUE))
p4 <- g58 %>% 
  mutate(hrs_max = as.numeric(hrs_max)) %>% 
  summarise(hrs_max = sum(hrs_max, na.rm = TRUE))
```

```{r hrs_plots, fig.show='hold', out.width="50%"}
# g58 is already supplied by the pipe, so it is not passed to ggplot() a second time
p3 <- g58 %>%
  ggplot(mapping = aes(x = hrs_min,
                       y = count_occupation,
                       fill = sex)) +
  geom_bar(stat = "identity",
           position = "dodge") +
  theme_bw() +
  xlab("Minimum Hours") +
  ylab("Count") +
  ggtitle("Min hours worked at Occupation")
p3

p4 <- g58 %>%
  ggplot(mapping = aes(x = hrs_max,
                       y = count_occupation,
                       fill = sex)) +
  geom_bar(stat = "identity",
           position = "dodge") +
  theme_bw() +
  xlab("Maximum Hours") +
  ylab("Count") +
  ggtitle("Max hours worked at Occupation")
p4
```

Row{data-height=250}
-----------------------------------------------------------------------
### Chart D

- The figure shows that most employees in the SA4 regions work as Professionals, Managers, and Technicians and trade workers. Professionals accounted for the highest number of employees in region 206, while machinery operators and drivers accounted for the lowest number of employees in region 213.

```{r popareaoccupation, fig.cap=""}

popareaoccupation <- g58 %>%
  group_by(SA4_CODE_2016, occupation) %>%
  summarise(count_occupationarea = sum(count_occupation)) %>%
  ungroup()

popareaoccupationmax <- popareaoccupation %>% 
  select(1:3) %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_occupationarea) %>%
  arrange(SA4_CODE_2016)
popareaoccupationmax %>% 
  full_join(sa4_geomap, 
            by = c("SA4_CODE_2016"="SA4_CODE")) %>%
  ggplot() +
  geom_sf(mapping = aes(geometry= geometry, fill=occupation)) +
  geom_sf_text(aes(geometry= geometry,label=occupation), colour="white", check_overlap=TRUE)+
  theme_void() +
  theme(legend.position = "bottom")
bestoccupation <- popareaoccupation %>% 
  select(1:3) %>%
  group_by(occupation) %>%
  slice_max(count_occupationarea) %>%
  arrange(SA4_CODE_2016)

bestoccupation %>%
  ggplot() +
  geom_col(mapping = aes(x = reorder_within(occupation,count_occupationarea, SA4_CODE_2016), y = count_occupationarea, fill = occupation)) +
  labs(title = "Region and Best Occupation", y = "Number of Employees") +
  scale_y_continuous(label=label_number()) +
  coord_flip()  +
  theme(legend.position = "none")
```





Maps {data-orientation=columns data-navmenu="Regions"}
=========================================

Column 
-----------------------------------------

- The maps show the SA4 regions and the distribution of the population by education level, industry, field of study and occupation.

- Most of the population has completed education level 7, with management and commerce as the field of study.

- The highest numbers of people are employed as Professionals, Managers, and Technicians and trade workers.

- The major industry in the city regions is health care, while the country regions are more active in agriculture.

Column 
-----------------------------------------
### Education Level: Region


```{r edmap, fig.cap="Spatial Education Level Distribution"}

popareaedu <- g46a %>%
  group_by(SA4_CODE_2016, afq_level) %>%
  summarise(count_eduarea = sum(count_edu_lvl)) %>%
  ungroup()
popareaedumax <- popareaedu %>% 
  select(1:3) %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_eduarea) %>%
  arrange(SA4_CODE_2016)

popareaedumax %>% 
  full_join(sa4_geomap, 
            by = c("SA4_CODE_2016"="SA4_CODE")) %>%
  ggplot() +
  geom_sf(mapping = aes(geometry= geometry, fill=afq_level)) +
  geom_sf_text(aes(geometry= geometry,label=afq_level), colour="black", check_overlap=TRUE)+
  theme_void() +
  scale_fill_brewer() +
  theme(legend.position = "bottom")
```


### Industry: Region

```{r indmap, fig.cap="Spatial Industry Distribution"}
popindarea <- g51 %>%
  group_by(SA4_CODE_2016, industry) %>%
  summarise(count_indarea = sum(count_industry)) %>%
  ungroup()
popindareamax <- popindarea %>% 
  select(1:3) %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_indarea)
popindareamax %>% 
  full_join(sa4_geomap, 
            by = c("SA4_CODE_2016"="SA4_CODE")) %>%
  ggplot() +
  geom_sf(mapping = aes(geometry= geometry, fill=industry)) +
  geom_sf_text(aes(geometry= geometry,label=industry ), colour="black",check_overlap=TRUE)+
  theme_void() +
  scale_fill_brewer() +
  theme(legend.position = "bottom")
# major industry in the CBD is healthcare
# major industry in the countryside is agriculture
```


Column 
-----------------------------------------

### Field: Region



```{r fieldmap, fig.cap="Spatial Study Field Distribution"}
popareafield <- g47 %>%
  group_by(SA4_CODE_2016, field) %>%
  summarise(count_fieldarea = sum(count_field)) %>%
  ungroup()

popareafieldmax <- popareafield %>% 
  select(1:3) %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_fieldarea) %>%
  arrange(SA4_CODE_2016)

popareafieldmax %>% 
  full_join(sa4_geomap, 
            by = c("SA4_CODE_2016"="SA4_CODE")) %>%
  ggplot() +
  geom_sf(mapping = aes(geometry= geometry, fill=field)) +
  geom_sf_text(aes(geometry= geometry,label=field), colour="black", check_overlap=TRUE)+
  theme_void() +
  scale_fill_brewer() +
  theme(legend.position = "bottom")

```




### Occupation: Region


```{r occmap, fig.cap="Spatial Occupation Distribution"}

popareaocc <- g57 %>%
  group_by(SA4_CODE_2016, occupation) %>%
  summarise(count_occarea = sum(count_occupation)) %>%
  ungroup()
popoccareamax <- popareaocc %>% 
  select(1:3) %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_occarea)
popoccareamax %>% 
  full_join(sa4_geomap, 
            by = c("SA4_CODE_2016"="SA4_CODE")) %>%
  ggplot() +
  geom_sf(mapping = aes(geometry= geometry, fill=occupation)) +
  geom_sf_text(aes(geometry= geometry,label=occupation), colour="black", check_overlap=TRUE)+
  theme_void() +
  scale_fill_brewer() +
  theme(legend.position = "bottom")

```




Networks {data-orientation=columns data-navmenu="Regions"}
=========================================


Column 
-----------------------------------------
### Education Level: Region


```{r}
popeduarea <- g46a %>%
 group_by(SA4_CODE_2016, afq_level) %>%
  summarise(count_eduarea = sum(count_edu_lvl)) %>%
  ungroup()
nodesarea <- data.frame(node = unique(popeduarea$SA4_CODE_2016),
                    category = "area") %>%
  full_join(data.frame(node = unique(popeduarea$afq_level),
                    category = "afq_level"))
popeduarea <- popeduarea[,c(1,2,3,2)]
networkeduarea <-   graph_from_data_frame(d=popeduarea,directed = TRUE, vertices = nodesarea)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkeduarea %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_eduarea,edge_width = count_eduarea,edge_color = afq_level),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```


### Industry: Region

```{r }
popindarea <- g51 %>%
  group_by(SA4_CODE_2016, industry) %>%
  summarise(count_indarea = sum(count_industry)) %>%
  ungroup()
nodesarea <- data.frame(node = unique(popindarea$SA4_CODE_2016),
                    category = "area") %>%
  full_join(data.frame(node = unique(popindarea$industry),
                    category = "industry"))
popindarea <- popindarea[,c(1,2,3,2)]
networkindarea <-   graph_from_data_frame(d=popindarea,directed = TRUE, vertices = nodesarea)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkindarea %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_indarea,edge_width = count_indarea,edge_color = industry),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```


Column 
-----------------------------------------

### Field: Region



```{r}
popfieldarea <- g47 %>%
 group_by(SA4_CODE_2016, field) %>%
  summarise(count_fieldarea = sum(count_field)) %>%
  ungroup()
nodesarea <- data.frame(node = unique(popfieldarea$SA4_CODE_2016),
                    category = "area") %>%
  full_join(data.frame(node = unique(popfieldarea$field),
                    category = "field"))
popfieldarea <- popfieldarea[,c(1,2,3,2)]
networkfieldarea <-   graph_from_data_frame(d=popfieldarea,directed = TRUE, vertices = nodesarea)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkfieldarea %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_fieldarea,edge_width = count_fieldarea,edge_color = field),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```




### Occupation: Region


```{r }
popareaocc <- g57 %>%
  group_by(SA4_CODE_2016, occupation) %>%
  summarise(count_occarea = sum(count_occupation)) %>%
  ungroup()
nodesareaocc <- data.frame(node = unique(popareaocc$SA4_CODE_2016),
                    category = "area") %>%
  full_join(data.frame(node = unique(popareaocc$occupation),
                    category = "occupation"))
popareaocc <- popareaocc[,c(1,2,3,2)]
networkoccarea <-   graph_from_data_frame(d=popareaocc,directed = TRUE, vertices = nodesareaocc)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkoccarea %>%
  ggraph(layout = "stress") +
  geom_edge_link2(aes(edge_alpha = count_occarea,edge_width = count_occarea,edge_color = occupation),arrow = a) +
  geom_node_point(aes(size = 2, colour =category) )+
  geom_node_text(aes(label = name), repel = TRUE,  point.padding = unit(0.15, "lines")) +
theme_void() +
  theme(legend.position = "none")
```



Conclusion {data-orientation=columns}
=========================================


Column 
-----------------------------------------------------------------------
Conclusion

The education level, field of study, industry of employment and occupation of the Victorian population were studied at the SA4 level, with distributions broken down by gender and age. The tables and plots were compared to identify covariation between the population distributions, for example the population trend between field of study and industry of employment. Networks were drawn based on the population weights to analyze these trends. Some trends were particularly interesting, such as more men being employed as managers even though more women had studied management. Choropleth maps were made to analyze these trends spatially.

The goal of this report is to create a data story from these statistical summaries, enumerating the facts in the data and linking them to the real world. The data provided by the Australian Bureau of Statistics is aggregated open data and in no way identifies individuals who participated in the census. The ABS aims to integrate the census data with other datasets to make it more interesting. We aim to do the same and to bring out further data stories as we continue building this report.



References {data-orientation=columns}
=========================================

### Data Sources

- [Australian Bureau of Statistics 2016](https://www.abs.gov.au/websitedbs/censushome.nsf/home/2016)

- Australian Bureau of Statistics (2016) 'Census GeoPackages', [GeoPackages](https://datapacks.censusdata.abs.gov.au/geopackages/), accessed May 2021. 

- Australian Bureau of Statistics (2016) 'Census DataPacks', [Census DataPacks](https://datapacks.censusdata.abs.gov.au/datapacks/), accessed May 2021.


- Australian Bureau of Statistics (2016) 'Census DataPacks', [Census DataPacks](https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by%20Subject/2011.0.55.001~2016~Main%20Features~DataPacks~5), accessed May 2021.

- Australian Bureau of Statistics [(2016)](https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by%20Subject/2900.0~2016~Main%20Features~Understanding%20the%20Census%20and%20Census%20Data~1)

- Australian Statistical Geography Standard [(ASGS)](https://www.abs.gov.au/websitedbs/D3310114.nsf/home/Australian+Statistical+Geography+Standard+(ASGS))